Abstract

When solving a system of linear equations, one can consider a simple iterative algorithm in which,
at each step, an equation containing only one variable is chosen and that variable is fixed to the
value satisfying the equation. The dynamics of this algorithm are captured by the peeling algorithm.
Analyses of the peeling algorithm on random hypergraphs are required for many problems, e.g.,
the decoding threshold of low-density parity-check codes, the inverting threshold of Goldreich's
pseudorandom generator, and the load threshold of cuckoo hashing. In this work, we deal with
random hypergraphs with a superlinear number of hyperedges and derive the tight threshold
for the success of the peeling algorithm. Wormald's method of differential equations, which is
commonly used to analyze the peeling algorithm on random hypergraphs with a linear number of
hyperedges, cannot be used here because the number of hyperedges is superlinear. A
new method, called the evolution of the moment generating function, is proposed in this work.


1 Introduction 

The peeling algorithm is a simple message-passing algorithm on hypergraphs that has been used in the
analysis of many practical problems, e.g., the decoding of low-density parity-check codes [1], the
satisfiability and clustering phase transitions of random $k$-XORSAT [2], the load threshold of cuckoo
hashing [3], and invertible Bloom lookup tables [3]. The peeling algorithm works on a bipartite graph
representation of a hypergraph consisting of vertex nodes and hyperedge nodes. In the $d$-peeling
algorithm, hyperedge nodes of degree at most $d-1$ are iteratively removed together with the vertex
nodes connected to them. In this work, we consider the peeling algorithm on a randomly generated
$k$-uniform hypergraph with a superlinear number of hyperedges, from which a sublinear number of
vertices are initially removed. Problems of this type were considered in [5], [6]. The results of this
paper are useful for analyses of message-passing algorithms for planted MAX-$k$-LIN and planted
uniquely extendible constraint satisfaction problems [?], [?] and of the inverting algorithm for
Goldreich's generator [8]. For analyses of the peeling algorithm, two methods have been used in
previous works: density evolution [9] and Wormald's method of differential equations [?], [?]. Density
evolution is not available in our setting since the hypergraph is not locally tree-like due to the
superlinear number of hyperedges. Wormald's method is also not available since the numbers of
hyperedges with particular degrees in the peeling process are highly unbalanced, e.g., the number of
degree-1 hyperedge nodes is sublinear while the number of degree-3 hyperedge nodes is superlinear.
The analysis in this work is founded on the Markov chain of the numbers of hyperedge nodes, which
has also been used in Wormald's method [?], [?]. We analyze the peeling algorithm by introducing the
evolution of the moment generating function, which gives a precise analysis of the behavior of the
peeling algorithm.

Figure 1: A bipartite graph representation of a 3-uniform hypergraph.

2 Main results 

In this work, we consider randomly generated hypergraphs. 

Definition 1 (Random hypergraph). A random hypergraph $G_k(n, m(n), \ell(n))$ is defined by the
following generating process. First, a $k$-uniform hypergraph is generated by choosing $m(n)$ hyperedges
independently and uniformly from all of the $\binom{n}{k}$ size-$k$ subsets of $n$ vertices. Second,
$\ell(n)$ randomly chosen vertices are removed from the $k$-uniform hypergraph. Equivalently, we can
assume that the $\ell(n)$ vertices with the smallest indices are removed.
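
The generating process of Definition 1 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `sample_hypergraph` and its parameters are hypothetical, vertices are labelled `0..n-1`, and the removed vertices are taken to be the `ell` smallest indices, as the definition permits.

```python
import random

def sample_hypergraph(n, m, ell, k, seed=None):
    """Sample G_k(n, m, ell) as a list of hyperedges (frozensets of surviving vertices).

    Hypothetical helper for illustration: m size-k subsets of {0, ..., n-1} are drawn
    independently and uniformly, then vertices 0, ..., ell-1 are removed from each.
    """
    rng = random.Random(seed)
    removed = set(range(ell))  # by symmetry, remove the ell smallest indices
    edges = []
    for _ in range(m):
        e = rng.sample(range(n), k)  # one uniform size-k subset
        edges.append(frozenset(v for v in e if v not in removed))
    return edges

edges = sample_hypergraph(n=50, m=200, ell=5, k=3, seed=0)
```

Each returned hyperedge has degree between 0 and `k`; the degree is `k` minus the number of its vertices that were removed.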

In this paper, we always assume $\ell(n) \in \omega(1) \cap o(n)$. For a hypergraph generated randomly
as above, the $d$-peeling algorithm considered in this paper iteratively removes hyperedge nodes of
degree at most $d-1$ until no such node exists (see Definition 4 for the formal definition). The
behavior of the $k$-peeling algorithm on $G_k(n, m(n), \ell(n))$ is essentially determined by the
connectivity of the random $k$-uniform hypergraph $G_k(n, m(n), 0)$, since the vertices of
$G_k(n, m(n), \ell(n))$ removed by the $k$-peeling algorithm are those that were connected to some of the
$\ell(n)$ vertices removed from the random $k$-uniform hypergraph. Hence, the asymptotic behavior of the
$k$-peeling algorithm on $G_k(n, m(n), \ell(n))$ is derived from the phase transition phenomenon of the
connectivity of $G_k(n, m(n), 0)$ [12] (see also Appendix C). In this work, we show the phase transition
phenomenon of the $d$-peeling algorithm for $d \le k-1$. Let the threshold constant be

\[ \mu_c(k,r) := \frac{1}{r\binom{k}{r}} \, \frac{(r-2)^{r-2}}{(r-1)^{r-1}}. \]

The following are the main results of this paper.

Theorem 2. Let $m(n) = \mu \frac{n^{r-1}}{\ell(n)^{r-2}}$ for an arbitrary constant $\mu > \mu_c(k,r)$.
Then, the $(k-r+2)$-peeling algorithm removes $n - o(n)$ vertices of $G_k(n, m(n), \ell(n))$ with high
probability for $r \in \{3, \dots, k\}$. In addition, if $m(n) = \omega(n \log n)$, i.e.,
$\ell(n) = o(n/(\log n)^{1/(r-2)})$, the $(k-r+2)$-peeling algorithm removes all vertices of
$G_k(n, m(n), \ell(n))$ with high probability.

Theorem 3. Let $m(n) = \mu \frac{n^{r-1}}{\ell(n)^{r-2}}$ for an arbitrary constant $\mu < \mu_c(k,r)$.
Then, the $(k-r+2)$-peeling algorithm removes only $\Theta(\ell(n))$ vertices of $G_k(n, m(n), \ell(n))$
with high probability for $r \in \{3, \dots, k\}$.

The above results show that $\mu_c(k,r)$ is the sharp threshold constant for the behavior of the peeling
algorithm. Furthermore, upper bounds on the large deviation rate and on the number of removed
vertices below the threshold are also obtained in this paper.

3 Bipartite graph representation of hypergraphs, peeling algorithm and stopping sets

In this work, a hypergraph is represented by a bipartite graph. The bipartite graph representation
consists of two types of nodes, "vertex nodes" and "hyperedge nodes", which correspond to the
vertices and the hyperedges of the hypergraph, respectively. A vertex node $v$ and a hyperedge node $e$
are connected by an edge in the bipartite graph representation if and only if the vertex corresponding
to $v$ is a member of the hyperedge corresponding to $e$ in the hypergraph. An example of the bipartite
graph representation is shown in Fig. 1. The set of vertex nodes and the set of hyperedge nodes are
denoted by $V$ and $E$, respectively. The neighborhood of a vertex node $v \in V$ and the neighborhood
of a hyperedge node $e \in E$ are denoted by $\partial v \subseteq E$ and $\partial e \subseteq V$, respectively.

Definition 4 (Peeling algorithm for a bipartite graph). For $d \in \{2, 3, \dots, k\}$, the $d$-peeling
algorithm for a bipartite graph is defined as follows. If there is a hyperedge node $e \in E$ of degree at
most $d-1$, then the hyperedge node $e$ and all of the at most $d-1$ vertex nodes connected to the
hyperedge node $e$ are removed from the bipartite graph. This process is iterated until there is no
hyperedge node of degree at most $d-1$.
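
The procedure of Definition 4 can be sketched as a worklist algorithm. The following is a minimal illustration, not the authors' implementation: the name `peel` and the data layout (hyperedges as sets of vertex labels, restricted to the surviving vertices on entry) are assumptions made for this example.

```python
from collections import defaultdict

def peel(edges, vertices, d):
    """Run the d-peeling algorithm; return the set of vertex nodes that survive."""
    alive = set(vertices)
    live = [set(e) & alive for e in edges]  # current neighborhood of each hyperedge node
    incident = defaultdict(list)            # vertex -> indices of hyperedges containing it
    for i, e in enumerate(live):
        for v in e:
            incident[v].append(i)
    done = [False] * len(live)
    stack = [i for i, e in enumerate(live) if len(e) <= d - 1]
    while stack:
        i = stack.pop()
        if done[i]:
            continue
        done[i] = True                      # remove hyperedge node i ...
        for v in list(live[i]):             # ... and all vertex nodes connected to it
            alive.discard(v)
            for j in incident[v]:
                live[j].discard(v)
                if not done[j] and len(live[j]) <= d - 1:
                    stack.append(j)
    return alive

# Vertex 0 is initially removed from a 3-uniform hypergraph on {0, ..., 5}.
edges = [{0, 1, 2}, {1, 2, 3}, {3, 4, 5}]
print(peel(edges, range(1, 6), d=3))  # 3-peeling removes every remaining vertex
print(peel(edges, range(1, 6), d=2))  # 2-peeling stops immediately
```

The order in which eligible hyperedge nodes are processed does not change the returned set, so a stack is used purely for convenience.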

Note that in similar settings, the peeling algorithm was analyzed for $k = 3$ and $d = 2$ in [6], [5].
The peeling algorithm stops if and only if the current set of vertex nodes forms a structure called a
stopping set.

Definition 5 (Stopping set [13]). For $d \ge 2$, a subset $S \subseteq V$ is called a $d$-stopping set if
$|\partial e \cap S| \in \{0, d, d+1, \dots, k\}$ for all hyperedge nodes $e \in E$.

It is obvious that the $d$-peeling algorithm terminates at the largest $d$-stopping set. Hence, to
analyze the $d$-peeling algorithm, it is sufficient to analyze the existence of non-empty $d$-stopping sets.

We classify non-empty $d$-stopping sets into three classes according to their size: $\alpha$-small
$d$-stopping sets, whose size is at least 1 and at most $\lfloor \alpha n \rfloor$; $\alpha$-linear
$d$-stopping sets, whose size is at least $\lfloor \alpha n \rfloor + 1$ and at most
$\lfloor (1-\alpha) n \rfloor$; and $\alpha$-large $d$-stopping sets, whose size is at least
$\lfloor (1-\alpha) n \rfloor + 1$, for some fixed $\alpha \in (0, 1/2)$.
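
The condition of Definition 5 is easy to check directly. The sketch below uses a hypothetical helper `is_stopping_set`; hyperedges are given as sets of vertex labels, and `k` is the uniformity of the original hypergraph.

```python
def is_stopping_set(edges, S, d, k):
    """Return True if S is a d-stopping set: every hyperedge meets S in 0 or in d..k vertices."""
    S = set(S)
    allowed = {0} | set(range(d, k + 1))
    return all(len(set(e) & S) in allowed for e in edges)

edges = [{0, 1, 2}, {1, 2, 3}, {3, 4, 5}]
print(is_stopping_set(edges, {1, 2, 3, 4, 5}, d=2, k=3))  # True
print(is_stopping_set(edges, {1, 2, 3, 4, 5}, d=3, k=3))  # False: {0, 1, 2} meets S in 2 vertices
```

With these inputs, the 2-stopping set {1, ..., 5} is exactly the set on which 2-peeling halts when vertex 0 is removed first, in line with the statement that peeling terminates at the largest stopping set.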

4 Analysis of stopping sets 

In this section, we present the results on the existence of stopping sets, which reveal the behavior
of the peeling algorithm. As shown in Appendix A, by the standard analysis using Markov's inequality
and the expected number of stopping sets, it is easy to show that there is no $\alpha$-linear
$d$-stopping set if the number $m(n)$ of hyperedges is superlinear.

Lemma 6 (Linear-size stopping sets). For any $d \in \{2, 3, \dots, k\}$ and $\alpha \in (0, 1/2)$, there
exists $\beta > 0$ such that $G_k(n, \beta n, 0)$ does not have an $\alpha$-linear $d$-stopping set with
probability $1 - \exp\{-\Theta(n)\}$.

Similarly, it is also shown in Appendix B that there is no $\alpha$-small stopping set if
$m(n) = \omega(n \log n)$.

Lemma 7 (Threshold for small stopping sets). For any $d \in \{2, 3, \dots, k\}$ and $\alpha \in (0, 1/2)$,
$G_k(n, \mu n \log n, 0)$ does not have an $\alpha$-small $d$-stopping set with probability
$1 - O(n^{-\delta})$ for any $\mu > 1/k$ and $\delta \in (0, \mu k - 1)$.

Conversely, if $m(n) = \mu n \log n$ for $\mu < 1/k$, then from the theory of the coupon collector's
problem, with high probability there exists a vertex node that is not connected to any hyperedge node.
Therefore, there exists a $d$-stopping set of size 1 with high probability. Hence, the constant $1/k$,
which appears as the coefficient of $n \log n$, is the sharp threshold for the existence of small
stopping sets. While the above two lemmas are obtained by Markov's inequality and an analysis of the
expected number of 2-stopping sets, the analysis of $\alpha$-large $d$-stopping sets requires a more
involved analysis of the dynamics of the $d$-peeling algorithm. Recall that
$\ell(n) \in \omega(1) \cap o(n)$. The following results on large stopping sets are shown in the
next section.

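Theorem 8. Fix $r \ge 3$. For any constant $\mu < \mu_c(k,r)$, $G_k\big(n, \mu \frac{n^{r-1}}{\ell(n)^{r-2}}, \ell(n)\big)$
has a $(k-r+2)$-stopping set of size larger than $n - (1+\tau)\ell(n)$ with probability at least
$1 - p(n, \mu, \tau)$ for any $\tau > \tau^*$, where $\tau^* \in (0, 1/(r-2))$ is the unique solution in
$(0, 1/(r-2))$ of

\[ \mu = \frac{1}{r\binom{k}{r}} \, \frac{\tau^*}{(1+\tau^*)^{r-1}}. \]

Here, the probability $p(n, \mu, \tau)$ is

\[ \exp\Big\{ \inf_{\lambda > 0,\ \tau' \in (\tau^*, \tau)} \big\{\varphi_{k,r}(\mu, \lambda, \tau')\big\}\, \ell(n) + O\big(\max\{\ell(n)^2/n,\ 1\}\big) \Big\} \]

where

\[ \varphi_{k,r}(\mu, \lambda, \tau) := \mu \big(\exp\{(k-r+1)\lambda\} - 1\big) \binom{k}{r-1} (1+\tau)^{r-1} - \lambda\tau. \tag{1} \]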
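
The constants in Theorem 8 are easy to evaluate numerically. The sketch below assumes the threshold $\mu_c(k,r) = (r-2)^{r-2} / (r\binom{k}{r}(r-1)^{r-1})$ and the rate function of (1) with the binomial factor $\binom{k}{r-1}$; the function names are hypothetical, and $\tau^*$ is found by plain bisection, which is valid because $\tau/(1+\tau)^{r-1}$ is increasing on $(0, 1/(r-2))$.

```python
from math import comb, exp

def mu_c(k, r):
    """Threshold constant mu_c(k, r) for r >= 3."""
    return (r - 2) ** (r - 2) / (r * comb(k, r) * (r - 1) ** (r - 1))

def tau_star(k, r, mu, iters=200):
    """Solve mu = tau / (r * C(k,r) * (1+tau)^(r-1)) on (0, 1/(r-2)); needs mu < mu_c(k, r)."""
    g = lambda t: t / (r * comb(k, r) * (1 + t) ** (r - 1)) - mu
    lo, hi = 0.0, 1.0 / (r - 2)
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def phi(k, r, mu, lam, tau):
    """Rate function phi_{k,r}(mu, lambda, tau) as in (1)."""
    return mu * (exp((k - r + 1) * lam) - 1) * comb(k, r - 1) * (1 + tau) ** (r - 1) - lam * tau

k, r, mu = 3, 3, 0.05            # mu_c(3, 3) = 1/12, so mu is below the threshold
t = tau_star(k, r, mu)
print(mu_c(k, r), t, phi(k, r, mu, 1e-3, 0.5))
```

For any $\tau \in (\tau^*, 1/(r-2)]$ the value of `phi` is negative for small positive `lam`, which is what makes the Chernoff bound in the proof of Theorem 8 nontrivial.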

Theorem 9. Fix $r \ge 3$. For any $\alpha \in (0, 1/2)$ and for any constant $\mu > \mu_c(k,r)$,
$G_k\big(n, \mu \frac{n^{r-1}}{\ell(n)^{r-2}}, \ell(n)\big)$ does not have an $\alpha$-large
$(k-r+2)$-stopping set with probability at least

\[ 1 - \exp\Big\{ \sup_{\tau > 0} \inf_{\lambda < 0} \big\{\varphi_{k,r}(\mu, \lambda, \tau)\big\}\, \ell(n) + O\big(\max\{\ell(n)^2/n,\ \log \ell(n)\}\big) \Big\}. \]

Here, it holds

\[ \exp\Big\{ \sup_{\tau > 0} \inf_{\lambda < 0} \big\{\varphi_{k,r}(\mu, \lambda, \tau)\big\} \Big\} = \rho^{\frac{1-(r-2)\tau}{r-1}} \]

where $(\rho, \tau)$ is the solution of

\[ \rho = \exp\Big\{ \mu \binom{k}{r-2} (1+\tau)^{r-2} (k-r+2) \big(\rho^{k-r+1} - 1\big) \Big\} \tag{2} \]

\[ \mu \rho^{k-r+1} \binom{k}{r} r (1+\tau)^{r-1} = \tau. \tag{3} \]

Note that for $\mu > (k(k-1))^{-1}$, $G_k(n, \mu n, 0)$ has a giant component whose size is concentrated
around $(1-\rho)n$ where $\rho$ satisfies (2) for $r = 2$ [12]. Hence, the equations (2) and (3) may give
the generalized concept of "size of giant component" (see also Appendix C).


5 Evolution of the number of hyperedges in the peeling algorithm 

5.1 The Markov chain 

In this section, we analyze the numbers of hyperedges at each step of the iterations of the
$(k-r+2)$-peeling algorithm on $G_k(n, m(n), \ell(n))$. In this section, we deal with an arbitrary fixed
$r \ge 2$. For the analysis, we assume that only one hyperedge node $e \in E$ of degree at most $k-r+1$
is chosen in each step and that only one of the vertex nodes connected to the hyperedge node $e$ is
removed from the bipartite graph. Note that the scheduling of the peeling algorithm does not affect the
graph remaining after the termination of the peeling algorithm. Let $C_j(t)$ be a random variable
corresponding to the number of hyperedge nodes of degree $j$ after $t$ iterations for
$j \in \{0\} \cup [k]$, where $[k] := \{1, 2, \dots, k\}$, and $t \in \{0, 1, 2, \dots\}$. Obviously,
$[C_0(0), \dots, C_k(0)]$ obeys the multinomial distribution
$\mathrm{Multinom}(m(n), p_0(n), p_1(n), \dots, p_k(n))$ where

\[ p_j(n) := \frac{\binom{n-\ell(n)}{j}\binom{\ell(n)}{k-j}}{\binom{n}{k}}, \qquad j = 0, 1, \dots, k. \]
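
As a sanity check on the initial degree distribution, the sketch below evaluates the probability that a uniformly chosen size-$k$ hyperedge keeps exactly $j$ of its vertices after $\ell$ vertices are removed. It assumes the hypergeometric form implied by Definition 1; the function name `p` and the numerical values are chosen only for illustration.

```python
from math import comb

def p(j, n, ell, k):
    """Probability that a uniform size-k subset of n vertices has exactly j vertices
    among the n - ell surviving ones (hypergeometric form assumed from Definition 1)."""
    return comb(n - ell, j) * comb(ell, k - j) / comb(n, k)

n, ell, k = 1000, 30, 4
probs = [p(j, n, ell, k) for j in range(k + 1)]
print(sum(probs))                                    # the p_j sum to 1
print(probs[2], comb(k, 2) * (ell / n) ** (k - 2))   # compare with C(k,j) (ell/n)^(k-j)
```

For $\ell(n) = o(n)$ the leading behavior is $\binom{k}{j}(\ell(n)/n)^{k-j}$, which is why hyperedge nodes of small degree are rare while those of degree close to $k$ are abundant.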


Let $[B_1(t), B_2(t), \dots, B_{k-r+1}(t)]$ be a 0-1 random vector of weight 1 where $B_j(t) = 1$ if a
hyperedge node of degree $j$ is chosen at the $(t+1)$-th iteration and $B_j(t) = 0$ otherwise. We assume
that a hyperedge node is chosen uniformly from all hyperedge nodes of degree at most $k-r+1$. Hence,

\[ \Pr\big(B_j(t) = 1 \mid [C_0(t), \dots, C_k(t)]\big) = \frac{C_j(t)}{\sum_{j'=1}^{k-r+1} C_{j'}(t)} \]

if $\sum_{j'=1}^{k-r+1} C_{j'}(t) \ge 1$. Note that the distribution of
$[B_1(t), B_2(t), \dots, B_{k-r+1}(t)]$ is not used in the following analysis. Let
$N(t) := n - \ell(n) - t$ be the number of remaining vertex nodes after $t$ iterations when the iterations
continue until the $t$-th step. The sequence of random vectors $([C_0(t), \dots, C_k(t)])_{t=0,1,\dots,N(0)}$
is a Markov chain satisfying $[C_0(t+1), \dots, C_k(t+1)] = [C_0(t), \dots, C_k(t)]$ if
$\sum_{j=1}^{k-r+1} C_j(t) = 0$ and

\[ \begin{aligned}
C_k(t+1) &= C_k(t) - R_k(t) \\
C_j(t+1) &= C_j(t) - R_j(t) + R_{j+1}(t), \quad \text{for } j = 1, 2, \dots, k-1 \\
C_0(t+1) &= C_0(t) + R_1(t)
\end{aligned} \tag{4} \]

if $\sum_{j=1}^{k-r+1} C_j(t) \ge 1$, where $R_1(t), \dots, R_k(t)$ are independent random variables
conditioned on $[C_0(t), \dots, C_k(t)]$ and $[B_1(t), \dots, B_{k-r+1}(t)]$ obeying

\[ \begin{aligned}
R_j(t) &\sim \mathrm{Binom}\Big(C_j(t), \frac{j}{N(t)}\Big), && \text{for } j = k-r+2, k-r+3, \dots, k \\
R_j(t) &\sim B_j(t) + \mathrm{Binom}\Big(C_j(t) - B_j(t), \frac{j}{N(t)}\Big), && \text{for } j = 1, 2, \dots, k-r+1.
\end{aligned} \]

Let $E_1^{k-r+1}(t) := \sum_{j=1}^{k-r+1} j\, C_j(t)$ be the number of edges connected to hyperedge nodes
of degree at most $k-r+1$. Then, the probability that $G_k(n, m(n), \ell(n))$ does not have a
$(k-r+2)$-stopping set of size larger than $n - \ell(n) - t$ is exactly equal to

\[ \Pr\big(E_1^{k-r+1}(0) \ge 1,\ E_1^{k-r+1}(1) \ge 1,\ \dots,\ E_1^{k-r+1}(t-1) \ge 1\big). \tag{5} \]

For proving Theorems 8 and 9, we analyze the probability (5). A similar analysis was considered in [1],
[?], [?], in which the number of hyperedge nodes $m(n)$ is proportional to $n$. In that case, one can use
Wormald's theorem, which gives differential equations describing the behavior of the Markov chain in
the limit $n \to \infty$ [?]. In this paper, $m(n)$ is not necessarily proportional to $n$. Therefore,
different techniques are required.


5.2 Dominating Markov chain 

In this section, we prove Theorem 8. For the Markov chain (4), it holds

\[ \begin{aligned}
C_k(t+1) &= C_k(t) - R_k(t) \\
C_j(t+1) &= C_j(t) - R_j(t) + R_{j+1}(t), \quad \text{for } j = k-r+2, k-r+3, \dots, k-1 \\
E_1^{k-r+1}(t+1) &= E_1^{k-r+1}(t) - \sum_{j=1}^{k-r+1} R_j(t) + (k-r+1) R_{k-r+2}(t)
\end{aligned} \tag{6} \]

if $E_1^{k-r+1}(t) \ge 1$. For upper bounding (5), we consider the dominating Markov chain
$([\overline{E}_1^{k-r+1}(t), \overline{C}_{k-r+2}(t), \dots, \overline{C}_k(t)])_{t=0,1,\dots,N(0)}$ which
satisfies $\overline{E}_1^{k-r+1}(0) = E_1^{k-r+1}(0)$, $\overline{C}_j(0) = C_j(0)$ for
$j = k-r+2, \dots, k$ and

\[ \begin{aligned}
\overline{C}_k(t+1) &= \overline{C}_k(t) \\
\overline{C}_j(t+1) &= \overline{C}_j(t) + \overline{R}_{j+1}(t), \quad \text{for } j = k-r+2, k-r+3, \dots, k-1 \\
\overline{E}_1^{k-r+1}(t+1) &= \overline{E}_1^{k-r+1}(t) - 1 + (k-r+1) \overline{R}_{k-r+2}(t)
\end{aligned} \tag{7} \]

where

\[ \overline{R}_j(t) \sim \mathrm{Binom}\Big(\overline{C}_j(t), \frac{j}{N(t)}\Big), \quad \text{for } j = k-r+2, k-r+3, \dots, k. \]

The dominating Markov chain does not have the conditioning $E_1^{k-r+1}(t) \ge 1$ which appears in (6).
Hence, it is easier to analyze the dominating Markov chain (7) than to analyze the original Markov
chain (6). Obviously, (5) is upper bounded by

\[ \Pr\big(\overline{E}_1^{k-r+1}(0) \ge 1,\ \overline{E}_1^{k-r+1}(1) \ge 1,\ \dots,\ \overline{E}_1^{k-r+1}(t-1) \ge 1\big). \tag{8} \]

While it is easy to derive and analyze recurrence equations for the expectations of the dominating
Markov chain (7), we derive and analyze the recurrence equation of the moment generating function
of the dominating Markov chain (7) for a precise analysis. By the analysis of the moment generating
function in Section 6, the asymptotic behavior of the moment generating function of
$\overline{E}_1^{k-r+1}(t)$ can be derived for $t = \Theta(\ell(n))$.

Theorem 10 (Moment generating function of $\overline{E}_1^{k-r+1}(t)$). Assume
$m(n) = \mu \frac{n^{r-1}}{\ell(n)^{r-2}}$ for an arbitrary constant $\mu$ and
$\ell(n) \in \omega(1) \cap o(n)$. Then, for any constants $\tau > 0$ and $\lambda$, it holds

\[ \mathbb{E}\big[\exp\{\lambda \overline{E}_1^{k-r+1}(\lfloor \tau\ell(n) \rfloor)\}\big] = \exp\big\{\varphi_{k,r}(\mu, \lambda, \tau)\,\ell(n) + O(\max\{1, \ell(n)^2/n\})\big\} \]

where $\varphi_{k,r}(\mu, \lambda, \tau)$ is defined in (1).

The proof of Theorem 10 is shown in Section 6. Now, Theorem 8 can be proved by using Theorem 10
and the Chernoff bound.


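Proof of Theorem 8. From the Chernoff bound and Theorem 10, one obtains the inequality

\[ \begin{aligned}
\Pr\big(\overline{E}_1^{k-r+1}(\lfloor \tau\ell(n) \rfloor) \ge 1\big)
&\le \Pr\big(\overline{E}_1^{k-r+1}(\lfloor \tau\ell(n) \rfloor) \ge 0\big) \\
&\le \mathbb{E}\big[\exp\{\lambda \overline{E}_1^{k-r+1}(\lfloor \tau\ell(n) \rfloor)\}\big]
= \exp\big\{\varphi_{k,r}(\mu, \lambda, \tau)\,\ell(n) + O(\max\{1, \ell(n)^2/n\})\big\}
\end{aligned} \]

for any constants $\tau > 0$ and $\lambda > 0$. It holds

\[ \frac{\partial \varphi_{k,r}(\mu, \lambda, \tau)}{\partial \lambda} = \mu \exp\{(k-r+1)\lambda\}\, r\binom{k}{r} (1+\tau)^{r-1} - \tau. \]

If

\[ \left.\frac{\partial \varphi_{k,r}(\mu, \lambda, \tau)}{\partial \lambda}\right|_{\lambda = 0} = \mu r \binom{k}{r} (1+\tau)^{r-1} - \tau < 0 \tag{9} \]

then $\varphi_{k,r}(\mu, \lambda, \tau)$ is negative for sufficiently small $\lambda > 0$ since
$\varphi_{k,r}(\mu, 0, \tau) = 0$. The condition (9) is satisfied for some $\tau > 0$ when

\[ \mu < \frac{1}{r\binom{k}{r}} \sup_{\tau > 0} \frac{\tau}{(1+\tau)^{r-1}}. \tag{10} \]

When $r \ge 3$, the supremum is attained at $\tau = 1/(r-2)$, and hence the condition (10) is equivalent to

\[ \mu < \frac{(r-2)^{r-2}}{r\binom{k}{r}(r-1)^{r-1}} = \mu_c(k,r). \]

When $\mu < \mu_c(k,r)$, the inequality (9) is satisfied for any $\tau \in (\tau^*, 1/(r-2)]$. This means
that there exists a $(k-r+2)$-stopping set of size at least $n - \lfloor (1+\tau)\ell(n) \rfloor$ with high
probability for any $\tau \in (\tau^*, 1/(r-2)]$. By optimizing $\tau$ and $\lambda$, one obtains Theorem 8. □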
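
For completeness, the location and the value of the supremum in (10) can be checked by a one-line calculus computation. The following worked step is a sketch for $r \ge 3$ and uses only the function $\tau \mapsto \tau/(1+\tau)^{r-1}$ appearing in (10).

```latex
\begin{align*}
\frac{d}{d\tau}\,\frac{\tau}{(1+\tau)^{r-1}}
  &= \frac{(1+\tau) - (r-1)\tau}{(1+\tau)^{r}}
   = \frac{1-(r-2)\tau}{(1+\tau)^{r}}, \\
\left.\frac{\tau}{(1+\tau)^{r-1}}\right|_{\tau = \frac{1}{r-2}}
  &= \frac{1}{r-2}\left(\frac{r-2}{r-1}\right)^{r-1}
   = \frac{(r-2)^{r-2}}{(r-1)^{r-1}}.
\end{align*}
```

The derivative is positive for $\tau < 1/(r-2)$ and negative afterwards, so the function is increasing on $(0, 1/(r-2))$. This is also why the equation defining $\tau^*$ in Theorem 8 has a unique solution in that interval whenever $\mu < \mu_c(k,r)$.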


5.3 Dominated Markov chain 

In this section, we prove Theorem 9. We can use the same argument as in Lemma 14 in Appendix C for
the $(k-r+2)$-peeling algorithm. For $m(n) = \mu \frac{n^{r-1}}{\ell(n)^{r-2}}$, it holds

\[ \mathbb{E}[C_{k-r+2}(0)] = m(n)\, p_{k-r+2}(n) = \mu \binom{k}{r-2} n + O(\ell(n)). \]

Let us assume that there are $\mathbb{E}[C_{k-r+2}(0)]$ hyperedge nodes of degree $k-r+2$ with high
probability. In that case, if $\mu > [r(r-1)\binom{k}{r}]^{-1}$, it holds
$\mathbb{E}[C_{k-r+2}(0)] > ([(k-r+2)(k-r+1)]^{-1} + \delta) n$ for sufficiently small $\delta > 0$. Then,
from the argument in the proof of Lemma 14, linearly many vertex nodes are removed by the
$(k-r+2)$-peeling algorithm with high probability. However, $[r(r-1)\binom{k}{r}]^{-1}$ is strictly larger
than $\mu_c(k,r)$ for $r \ge 3$, and hence is not the sharp threshold.

In the following, we will show that if $\mu > \mu_c(k,r)$, for any $\eta > 0$ there exists $\tau > 0$ such that

\[ \Pr\big(E_1^{k-r+1}(0) \ge 1,\ \dots,\ E_1^{k-r+1}(\lfloor \tau\ell(n) \rfloor - 1) \ge 1,\ E_1^{k-r+1}(\lfloor \tau\ell(n) \rfloor) \ge \eta\,\ell(n)\big) = 1 - o(1) \tag{11} \]

and that if $\mu > \mu_c(k,r)$, there exists sufficiently small $\epsilon > 0$ such that for any
$\tau \ge 1/(r-2)$,

\[ \Pr\big(C_{k-r+2}(\lfloor \tau\ell(n) \rfloor) \ge ([(k-r+2)(k-r+1)]^{-1} + \epsilon)\, n\big) = 1 - o(1). \tag{12} \]

If the iteration of the peeling algorithm continues until $\lfloor \tau\ell(n) \rfloor$ steps and if
$E_1^{k-r+1}(\lfloor \tau\ell(n) \rfloor) \ge \eta\,\ell(n)$ and
$C_{k-r+2}(\lfloor \tau\ell(n) \rfloor) \ge ([(k-r+2)(k-r+1)]^{-1} + \epsilon)\, n$ hold, then from the
argument in the proof of Lemma 14, the $C_{k-r+2}(\lfloor \tau\ell(n) \rfloor)$ hyperedge nodes of degree
$k-r+2$, viewed as $(k-r+2)$-uniform hyperedges, generate a giant component of size $(1-\rho)n$ for some
$\rho \in (0,1)$ with high probability. In that case, the peeling algorithm removes linearly many vertex
nodes with probability at least $1 - \rho^{\Omega(\ell(n))}$. Furthermore, from Lemma 6, if
$m(n) = \omega(n)$, there is no stopping set of linear size with high probability. The above argument
implies that (11) and (12) give the proof of Theorem 9 except for the bound on the probability.

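For lower bounding the probabilities in (11) and (12), we consider a dominated Markov chain
$([\underline{E}_1^{k-r+1}(t), \underline{C}_{k-r+2}(t), \dots, \underline{C}_k(t)])_{t=0,1,\dots,N(0)}$ which
satisfies $\underline{E}_1^{k-r+1}(0) = \sum_{j=1}^{k-r+1} j\, C_j(0)$, $\underline{C}_j(0) = C_j(0)$ for
$j = k-r+2, \dots, k$ and

\[ \begin{aligned}
\underline{C}_k(t+1) &= \underline{C}_k(t) - \underline{R}_k(t) \\
\underline{C}_j(t+1) &= \underline{C}_j(t) - \underline{R}_j(t) + \underline{R}_{j+1}(t), \quad \text{for } j = k-r+2, k-r+3, \dots, k-1 \\
\underline{E}_1^{k-r+1}(t+1) &= \underline{E}_1^{k-r+1}(t) - 1 - \underline{R}_1^{k-r+1}(t) + (k-r+1) \underline{R}_{k-r+2}(t)
\end{aligned} \tag{13} \]

where

\[ \underline{R}_j(t) \sim \mathrm{Binom}\Big(\underline{C}_j(t), \frac{j}{N(t)}\Big), \quad \text{for } j = k-r+2, k-r+3, \dots, k. \]

The probabilities (11) and (12) can be lower bounded by replacing the original Markov chain by the
dominated Markov chain. Indeed, the dominating Markov chain (7) is very close to the dominated
Markov chain (13) for $t = O(\ell(n))$.

Theorem 11 (Moment generating function of $\underline{E}_1^{k-r+1}(t)$). Assume
$m(n) = \mu \frac{n^{r-1}}{\ell(n)^{r-2}}$ for an arbitrary constant $\mu$ and
$\ell(n) \in \omega(1) \cap o(n)$. Then, for any constants $\tau > 0$ and $\lambda$, it holds

\[ \mathbb{E}\big[\exp\{\lambda \underline{E}_1^{k-r+1}(\lfloor \tau\ell(n) \rfloor)\}\big] = \exp\big\{\varphi_{k,r}(\mu, \lambda, \tau)\,\ell(n) + O(\max\{1, \ell(n)^2/n\})\big\} \]

where $\varphi_{k,r}(\mu, \lambda, \tau)$ is defined in (1).

The proof is omitted since it is straightforward from the proof of Theorem 10. From Theorem 11,
if $\mu > \mu_c(k,r)$, it holds

\[ \begin{aligned}
\Pr\Big(\bigcup_{t=0}^{\lfloor \tau\ell(n) \rfloor - 1} \big\{\underline{E}_1^{k-r+1}(t) \le 0\big\}\Big)
&\le \sum_{t=0}^{\lfloor \tau\ell(n) \rfloor - 1} \Pr\big(\underline{E}_1^{k-r+1}(t) \le 0\big) \\
&\le \sum_{t=0}^{\lfloor \tau\ell(n) \rfloor - 1} \inf_{\lambda < 0} \mathbb{E}\big[\exp\{\lambda \underline{E}_1^{k-r+1}(t)\}\big] \\
&\le \exp\Big\{ \sup_{\tau' > 0} \inf_{\lambda < 0} \big\{\varphi_{k,r}(\mu, \lambda, \tau')\big\}\, \ell(n) + O\big(\max\{\ell(n)^2/n,\ \log \ell(n)\}\big) \Big\}.
\end{aligned} \tag{14} \]

Note that the above upper bound is independent of $\tau$. In the same way, one can show that if
$\mu > \mu_c(k,r)$, for any $\eta > 0$ and any $c > 0$, there is sufficiently large $\tau > 0$ such that

\[ \Pr\big(\underline{E}_1^{k-r+1}(\lfloor \tau\ell(n) \rfloor) \le \eta\,\ell(n)\big) \le \exp\{-c\,\ell(n)\}. \]

Similarly to Theorem 11, an asymptotic analysis of the moment generating function for
$\underline{C}_{k-r+2}(t)$ is obtained for $t = \Theta(\ell(n))$.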

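Theorem 12 (Moment generating function of $\underline{C}_j(t)$). Assume
$m(n) = \mu \frac{n^{r-1}}{\ell(n)^{r-2}}$ for an arbitrary constant $\mu$ and
$\ell(n) \in \omega(1) \cap o(n)$. Then, for any $j = k-r+2, \dots, k$, for any constants $\tau > 0$ and
$\lambda_j$,

\[ \mathbb{E}\big[\exp\{\lambda_j \underline{C}_j(\lfloor \tau\ell(n) \rfloor)\}\big] = \exp\Big\{ \psi_{k}(\mu, \lambda_j, j, \tau)\, \frac{n^{j-k+r-1}}{\ell(n)^{j-k+r-2}} + O\Big(\frac{n^{j-k+r-1}}{\ell(n)^{j-k+r-1}} \max\Big\{1, \frac{\ell(n)^2}{n}\Big\}\Big) \Big\} \]

where

\[ \psi_{k}(\mu, \lambda_j, j, \tau) := \mu \big(\exp\{\lambda_j\} - 1\big) \binom{k}{j} (1+\tau)^{k-j}. \]

The proof of this theorem is also omitted since it is straightforward from the proof of Theorem 10.
From Theorem 12, for any $\tau \ge 1/(r-2)$, it holds

\[ \begin{aligned}
&\Pr\big(\underline{C}_{k-r+2}(\lfloor \tau\ell(n) \rfloor) \le ([(k-r+2)(k-r+1)]^{-1} + \epsilon)\, n\big) \\
&\quad\le \frac{\mathbb{E}\big[\exp\{\lambda_{k-r+2}\, \underline{C}_{k-r+2}(\lfloor \tau\ell(n) \rfloor)\}\big]}{\exp\{\lambda_{k-r+2} ([(k-r+2)(k-r+1)]^{-1} + \epsilon)\, n\}} \\
&\quad\le \frac{\mathbb{E}\big[\exp\{\lambda_{k-r+2}\, \underline{C}_{k-r+2}(\lfloor \ell(n)/(r-2) \rfloor)\}\big]}{\exp\{\lambda_{k-r+2} ([(k-r+2)(k-r+1)]^{-1} + \epsilon)\, n\}} \\
&\quad= \exp\Big\{ \mu \big(\exp\{\lambda_{k-r+2}\} - 1\big) \binom{k}{k-r+2} \Big(\frac{r-1}{r-2}\Big)^{r-2} n - \lambda_{k-r+2} ([(k-r+2)(k-r+1)]^{-1} + \epsilon)\, n + o(n) \Big\} \\
&\quad= \exp\Big\{ [(k-r+2)(k-r+1)]^{-1} \frac{\mu}{\mu_c(k,r)} \big(\exp\{\lambda_{k-r+2}\} - 1\big)\, n - \lambda_{k-r+2} ([(k-r+2)(k-r+1)]^{-1} + \epsilon)\, n + o(n) \Big\}
\end{aligned} \]

for any $\lambda_{k-r+2} < 0$. Hence, if $\mu > \mu_c(k,r)$, for sufficiently small $\epsilon > 0$, there is
$\delta > 0$ such that

\[ \Pr\big(\underline{C}_{k-r+2}(\lfloor \ell(n)/(r-2) \rfloor) \le ([(k-r+2)(k-r+1)]^{-1} + \epsilon)\, n\big) \le \exp\{-\delta n\}. \]

The probability that the peeling algorithm does not remove linearly many vertex nodes is dominated
by (14). By calculation of the saddle point, one obtains (2) and (3).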
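
The saddle-point calculation mentioned above can be sketched as follows. The sketch assumes that $\varphi_{k,r}$ has the form (1) with the binomial factor $\binom{k}{r-1}$ and uses the substitution $\rho := e^{\lambda}$, which is an assumption of this sketch rather than something stated in the text.

```latex
\begin{align*}
\frac{\partial \varphi_{k,r}}{\partial \lambda} = 0
  &\iff \mu\,(k-r+1)\binom{k}{r-1}\, e^{(k-r+1)\lambda}\,(1+\tau)^{r-1} = \tau, \\
\frac{\partial \varphi_{k,r}}{\partial \tau} = 0
  &\iff \lambda = \mu\,(r-1)\binom{k}{r-1}\,(1+\tau)^{r-2}\left(e^{(k-r+1)\lambda} - 1\right).
\end{align*}
% Binomial identities used below:
%   (k-r+1) \binom{k}{r-1} = r \binom{k}{r},   (r-1) \binom{k}{r-1} = (k-r+2) \binom{k}{r-2}.
\begin{align*}
\rho := e^{\lambda}: \qquad
  \mu\,\rho^{k-r+1}\binom{k}{r}\, r\,(1+\tau)^{r-1} &= \tau, \\
  \rho &= \exp\left\{\mu\binom{k}{r-2}(1+\tau)^{r-2}(k-r+2)\left(\rho^{k-r+1}-1\right)\right\}, \\
  \varphi_{k,r}(\mu,\lambda,\tau)
    &= \lambda\,\frac{1+\tau}{r-1} - \lambda\tau
     = \lambda\,\frac{1-(r-2)\tau}{r-1}.
\end{align*}
```

The last line uses the second stationarity condition to rewrite the first term of $\varphi_{k,r}$; exponentiating gives $\rho^{(1-(r-2)\tau)/(r-1)}$. For $r = 2$ the second equation reduces to $\rho = \exp\{\mu k(\rho^{k-1} - 1)\}$, the giant-component equation of Appendix C.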


6 Evolution of the moment generating function 


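In this section, the proof of Theorem 10 is shown. The moment generating function for
$[(\overline{E}_1^{k-r+1}(t) + t)/(k-r+1), \overline{C}_{k-r+2}(t), \dots, \overline{C}_k(t)]$ is defined as

\[ f_t(\lambda_{k-r+1}, \dots, \lambda_k) := \mathbb{E}\Big[\exp\Big\{\lambda_{k-r+1} \frac{\overline{E}_1^{k-r+1}(t) + t}{k-r+1} + \lambda_{k-r+2} \overline{C}_{k-r+2}(t) + \dots + \lambda_k \overline{C}_k(t)\Big\}\Big]. \]

From (7), one obtains a recursive formula

\[ \begin{aligned}
f_{t+1}(\lambda_{k-r+1}, \dots, \lambda_k)
&= \mathbb{E}\Big[\exp\Big\{\lambda_{k-r+1} \frac{\overline{E}_1^{k-r+1}(t) + t}{k-r+1} + \lambda_{k-r+2} \overline{C}_{k-r+2}(t) + \dots + \lambda_k \overline{C}_k(t)\Big\} \\
&\qquad\quad \cdot \exp\big\{\lambda_{k-r+1} \overline{R}_{k-r+2}(t) + \lambda_{k-r+2} \overline{R}_{k-r+3}(t) + \dots + \lambda_{k-1} \overline{R}_k(t)\big\}\Big] \\
&= \mathbb{E}\Big[\exp\Big\{\lambda_{k-r+1} \frac{\overline{E}_1^{k-r+1}(t) + t}{k-r+1} + \lambda_{k-r+2} \overline{C}_{k-r+2}(t) + \dots + \lambda_k \overline{C}_k(t)\Big\} \\
&\qquad\quad \cdot \prod_{j=k-r+2}^{k} \Big(1 - \frac{j}{N(t)} + \frac{j}{N(t)} \exp\{\lambda_{j-1}\}\Big)^{\overline{C}_j(t)}\Big] \\
&= f_t(\lambda_{k-r+1}, \lambda'_{k-r+2}, \dots, \lambda'_k)
\end{aligned} \]

where

\[ \lambda'_j = \lambda_j + \log\Big(1 - \frac{j}{N(t)} + \frac{j}{N(t)} \exp\{\lambda_{j-1}\}\Big) \]

for $j = k-r+2, k-r+3, \dots, k$. Let $\lambda^{(s)}_{k-r+1} := \lambda_{k-r+1}$ for $s = 0, 1, \dots, t$.
For $j = k-r+2, k-r+3, \dots, k$, let $\lambda^{(0)}_j := 0$ and

\[ \lambda^{(s)}_j := \lambda^{(s-1)}_j + \log\Big(1 - \frac{j}{N(t-s+1)} + \frac{j}{N(t-s+1)} \exp\{\lambda^{(s-1)}_{j-1}\}\Big) \]

for $s = 1, 2, \dots, t$. Then, it holds

\[ \begin{aligned}
\mathbb{E}\big[\exp\{\lambda_{k-r+1} (\overline{E}_1^{k-r+1}(t) + t)/(k-r+1)\}\big]
&= f_t(\lambda_{k-r+1}, 0, \dots, 0) \\
&= f_0(\lambda^{(t)}_{k-r+1}, \lambda^{(t)}_{k-r+2}, \dots, \lambda^{(t)}_k).
\end{aligned} \]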
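Lemma 13. For $t = O(\ell(n))$ and $\ell(n) = o(n)$, it holds

\[ \exp\{\lambda^{(t)}_j\} = 1 + \binom{j}{k-r+1} \frac{t^{j-k+r-1}}{n^{j-k+r-1}} \big(\exp\{\lambda_{k-r+1}\} - 1\big) + O\Big(\frac{\ell(n)^{j-k+r-2}}{n^{j-k+r-1}} \max\Big\{1, \frac{\ell(n)^2}{n}\Big\}\Big) \]

for $j = k-r+1, k-r+2, \dots, k$.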


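Proof. The lemma is shown by induction on $j$. The lemma obviously holds for $j = k-r+1$. Assume
the lemma holds for $j = j_0 - 1 \ge k-r+1$, then

\[ \begin{aligned}
\exp\{\lambda^{(t)}_{j_0}\}
&= \prod_{s=0}^{t-1} \Big(1 - \frac{j_0}{N(t-s)} + \frac{j_0}{N(t-s)} \exp\{\lambda^{(s)}_{j_0-1}\}\Big) \\
&= 1 + \sum_{s=0}^{t-1} \frac{j_0}{N(t-s)} \big(\exp\{\lambda^{(s)}_{j_0-1}\} - 1\big) + O\Big(\frac{\ell(n)^{2(j_0-k+r-1)}}{n^{2(j_0-k+r-1)}}\Big) \\
&= 1 + \sum_{s=0}^{t-1} \frac{j_0}{n} \big(\exp\{\lambda^{(s)}_{j_0-1}\} - 1\big) + O\Big(\frac{\ell(n)^{j_0-k+r}}{n^{j_0-k+r}}\Big) \\
&= 1 + \sum_{s=0}^{t-1} \frac{j_0}{n} \binom{j_0-1}{k-r+1} \frac{s^{j_0-k+r-2}}{n^{j_0-k+r-2}} \big(\exp\{\lambda_{k-r+1}\} - 1\big) + O\Big(\frac{\ell(n)^{j_0-k+r-2}}{n^{j_0-k+r-1}} \max\Big\{1, \frac{\ell(n)^2}{n}\Big\}\Big) \\
&= 1 + \binom{j_0}{k-r+1} \frac{t^{j_0-k+r-1}}{n^{j_0-k+r-1}} \big(\exp\{\lambda_{k-r+1}\} - 1\big) + O\Big(\frac{\ell(n)^{j_0-k+r-2}}{n^{j_0-k+r-1}} \max\Big\{1, \frac{\ell(n)^2}{n}\Big\}\Big).
\end{aligned} \]

□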


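Since the random variables $[C_0(0), \dots, C_k(0)]$ at the initial step obey the multinomial distribution
$\mathrm{Multinom}(m(n), p_0(n), \dots, p_k(n))$, it holds for $t = \lfloor \tau\ell(n) \rfloor$ and
$m(n) = \mu \frac{n^{r-1}}{\ell(n)^{r-2}}$ that

\[ \begin{aligned}
&f_0(\lambda^{(t)}_{k-r+1}, \lambda^{(t)}_{k-r+2}, \dots, \lambda^{(t)}_k) \\
&\quad= \Big( p_0(n) + \sum_{j=1}^{k-r+1} p_j(n) \exp\Big\{\frac{j}{k-r+1} \lambda_{k-r+1}\Big\} + \sum_{j=k-r+2}^{k} p_j(n) \exp\{\lambda^{(t)}_j\} \Big)^{m(n)} \\
&\quad= \Big( 1 + \sum_{j=k-r+1}^{k} p_j(n) \binom{j}{k-r+1} \frac{t^{j-k+r-1}}{n^{j-k+r-1}} \big(\exp\{\lambda_{k-r+1}\} - 1\big) + O\Big(\frac{\ell(n)^{r-2}}{n^{r-1}} \max\Big\{1, \frac{\ell(n)^2}{n}\Big\}\Big) \Big)^{m(n)} \\
&\quad= \Big( 1 + \frac{\ell(n)^{r-1}}{n^{r-1}} \big(\exp\{\lambda_{k-r+1}\} - 1\big) \sum_{j=k-r+1}^{k} \binom{k}{j} \binom{j}{k-r+1} \tau^{j-k+r-1} + O\Big(\frac{\ell(n)^{r-2}}{n^{r-1}} \max\Big\{1, \frac{\ell(n)^2}{n}\Big\}\Big) \Big)^{m(n)} \\
&\quad= \exp\Big\{ \ell(n)\, \mu \big(\exp\{\lambda_{k-r+1}\} - 1\big) \binom{k}{r-1} (1+\tau)^{r-1} + O\Big(\max\Big\{1, \frac{\ell(n)^2}{n}\Big\}\Big) \Big\}.
\end{aligned} \]

From

\[ \mathbb{E}\big[\exp\{\lambda (\overline{E}_1^{k-r+1}(\lfloor \tau\ell(n) \rfloor) + \lfloor \tau\ell(n) \rfloor)/(k-r+1)\}\big] = \exp\Big\{ \ell(n)\, \mu \big(\exp\{\lambda\} - 1\big) \binom{k}{r-1} (1+\tau)^{r-1} + O\Big(\max\Big\{1, \frac{\ell(n)^2}{n}\Big\}\Big) \Big\} \]

one obtains

\[ \mathbb{E}\big[\exp\{\lambda \overline{E}_1^{k-r+1}(\lfloor \tau\ell(n) \rfloor)\}\big] = \exp\Big\{ \Big[ \mu \big(\exp\{(k-r+1)\lambda\} - 1\big) \binom{k}{r-1} (1+\tau)^{r-1} - \lambda\tau \Big] \ell(n) + O\Big(\max\Big\{1, \frac{\ell(n)^2}{n}\Big\}\Big) \Big\}. \]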

A Proof of Lemma 6


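Let $S(l)$ be a random variable corresponding to the number of $d$-stopping sets of size $l$ in the
randomly generated hypergraph. Then, the probability that the randomly generated hypergraph includes
at least one $\alpha$-linear $d$-stopping set is upper bounded by using Markov's inequality as

\[ \Pr\Big( \sum_{l=\lfloor \alpha n \rfloor + 1}^{\lfloor (1-\alpha) n \rfloor} S(l) \ge 1 \Big) \le \sum_{l=\lfloor \alpha n \rfloor + 1}^{\lfloor (1-\alpha) n \rfloor} \mathbb{E}[S(l)]. \]

The expected number of $d$-stopping sets of size $l$ is equal to

\[ \mathbb{E}[S(l)] = \binom{n}{l} \Bigg( \sum_{s = 0, d, d+1, \dots, k} \frac{\binom{l}{s}\binom{n-l}{k-s}}{\binom{n}{k}} \Bigg)^{m(n)}. \]

Especially for $d = 2$, it holds

\[ \mathbb{E}[S(l)] = \binom{n}{l} \Bigg( 1 - \frac{l \binom{n-l}{k-1}}{\binom{n}{k}} \Bigg)^{m(n)}. \]

When $m(n) = \gamma n$ for some constant $\gamma > 0$, it holds

\[ \frac{1}{n} \log \mathbb{E}[S(\delta n)] = h(\delta) + \gamma \log\big(1 - k\delta(1-\delta)^{k-1}\big) + o(1) \]

for any $\delta \in (0, 1)$ where $h$ denotes the binary entropy function. Hence, for any fixed
$\alpha \in (0, 1/2)$, there is a constant $\gamma_\alpha$ such that

\[ h(\delta) + \gamma_\alpha \log\big(1 - k\delta(1-\delta)^{k-1}\big) < -1 \]

for any $\delta \in [\alpha, 1-\alpha]$. Hence,

\[ \sum_{l=\lfloor \alpha n \rfloor + 1}^{\lfloor (1-\alpha) n \rfloor} \mathbb{E}[S(l)] \le n \exp\{-n + o(n)\} \]

when $m(n) = \gamma_\alpha n$.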
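
The existence of the constant $\gamma_\alpha$ can be illustrated numerically for $d = 2$. The sketch below assumes that $h$ is the binary entropy in natural logarithms and scans a grid of $\delta \in [\alpha, 1-\alpha]$; the function names and the grid size are hypothetical choices, and a grid scan is only an approximation of the supremum over the whole interval.

```python
from math import log

def h(x):
    """Binary entropy function in nats."""
    return -x * log(x) - (1 - x) * log(1 - x)

def exponent(delta, gamma, k):
    """First-order exponent (1/n) log E[S(delta n)] for d = 2 and m(n) = gamma * n."""
    return h(delta) + gamma * log(1 - k * delta * (1 - delta) ** (k - 1))

def gamma_alpha(alpha, k, grid=2000):
    """Smallest gamma making the exponent at most -1 at every grid point of [alpha, 1 - alpha]."""
    best = 0.0
    for i in range(grid + 1):
        d = alpha + (1 - 2 * alpha) * i / grid
        need = (h(d) + 1) / -log(1 - k * d * (1 - d) ** (k - 1))
        best = max(best, need)
    return best

g = gamma_alpha(0.1, 3)
print(g, max(exponent(0.1 + 0.8 * i / 2000, g, 3) for i in range(2001)))
```

Any $\gamma$ strictly larger than the returned value gives an exponent strictly below $-1$ on the grid; the requirement grows as $\alpha \to 0$, in line with small stopping sets needing the separate treatment of Lemma 7.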

B Proof of Lemma 7

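From the inequality

\[ \log\Bigg( 1 - \frac{l \binom{n-l}{k-1}}{\binom{n}{k}} \Bigg) \le - \frac{l \binom{n-l}{k-1}}{\binom{n}{k}} \]

one obtains for $m(n) = \mu n \log n$ that

\[ \begin{aligned}
\sum_{l=1}^{\delta n} \mathbb{E}[S(l)]
&\le \sum_{l=1}^{\delta n} \binom{n}{l} \exp\Bigg\{ -m(n)\, l\, \frac{\binom{n-\delta n}{k-1}}{\binom{n}{k}} \Bigg\}
\le \Bigg( 1 + \exp\Bigg\{ -m(n) \frac{\binom{n-\delta n}{k-1}}{\binom{n}{k}} \Bigg\} \Bigg)^{n} - 1 \\
&= \big( 1 + n^{-\mu k (1-\delta)^{k-1} + o(1)} \big)^{n} - 1
\end{aligned} \]

for any $\delta \in (0, \alpha)$. Let $\delta_\mu := 1 - 1/(\mu k)^{1/(k-1)}$. For any $\mu > 1/k$ and any
$\delta \in (0, \delta_\mu)$, it holds $\mu k (1-\delta)^{k-1} > 1$, i.e.,

\[ \big( 1 + n^{-\mu k (1-\delta)^{k-1} + o(1)} \big)^{n} - 1 = O\big( n^{1 - \mu k (1-\delta)^{k-1} + o(1)} \big). \]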

C Analyses of stopping sets for r = 2 

In this section, the existence of $\alpha$-large $k$-stopping sets is analyzed. Lemma 14 in this section
is used in Section 5.3. For $\alpha$-large $k$-stopping sets, which correspond to the case $r = 2$, the
threshold is obtained as follows.

Lemma 14. For any $\mu > (k(k-1))^{-1}$, there exists $\alpha \in (0, 1)$ such that
$G_k(n, \mu n, \ell(n))$ does not have a $k$-stopping set of size greater than $\alpha n$ with probability
exponentially close to 1 with respect to $\ell(n)$.

Proof. From the theory of random hypergraphs, if $m(n) = \mu n$ where $\mu > (k(k-1))^{-1}$, then the
random hypergraph with $n$ vertices and $m(n)$ hyperedges has a giant component, which is a
connected component of size proportional to $n$, with probability approaching 1 exponentially fast
as $n \to \infty$ [?], [?]. It is also shown in [12] that the size of the giant component is concentrated
around $(1-\rho)n$ where $\rho \in (0, 1)$ is the unique solution of

\[ \rho = \exp\{\mu k (\rho^{k-1} - 1)\}. \]

Hence, the probability that the size of the giant component is greater than $(1 - \rho - \delta)n$ tends
to 1 exponentially fast with respect to $n$ for any $\delta > 0$. If at least one of the $\ell(n)$ removed
vertices is included in the giant component, the giant component is removed by the $k$-peeling
algorithm. In that case, the size of the largest stopping set is at most $(\rho + \delta)n$. The
probability that none of the $\ell(n)$ removed vertex nodes is included in the giant component is at
most $(\rho + \delta)^{\ell(n)}$. □
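
The fixed-point equation for $\rho$ in the proof above can be solved by simple iteration. The following sketch, with a hypothetical function name, starts from $\rho = 0$; since the map $\rho \mapsto \exp\{\mu k(\rho^{k-1} - 1)\}$ is increasing on $[0, 1]$, the iterates increase to the smallest fixed point, which is below 1 exactly when a giant component exists.

```python
from math import exp

def giant_rho(k, mu, iters=2000):
    """Iterate rho <- exp(mu * k * (rho^(k-1) - 1)) starting from 0.

    Converges to the smallest fixed point in [0, 1]; the giant component then
    occupies roughly a (1 - rho) fraction of the vertices.
    """
    rho = 0.0
    for _ in range(iters):
        rho = exp(mu * k * (rho ** (k - 1) - 1.0))
    return rho

print(giant_rho(2, 1.0))   # ordinary graphs with n edges, above the threshold 1/(k(k-1)) = 1/2
print(giant_rho(3, 0.1))   # below the threshold 1/6 for k = 3, so rho is (numerically) 1
```

For $k = 2$ and $\mu = 1$ this reproduces the classical random-graph value for average degree 2, where about 80% of the vertices lie in the giant component.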

The converse of Lemma 14 is also obtained as follows.

Lemma 15. For any $\mu < (k(k-1))^{-1}$, $G_k(n, \mu n, \ell(n))$ has a $k$-stopping set of size larger
than $n - (1+\tau)\ell(n)$ with high probability for any $\tau$ strictly larger than

\[ \frac{k(k-1)\mu}{1 - k(k-1)\mu}. \]

Proof. The proof is almost the same as the proof of Theorem 8. In (10), the supremum is approached as
$\tau \to +\infty$ when $r = 2$, and hence the condition (10) is equivalent to $\mu < [k(k-1)]^{-1}$. □